Feature selection for document classification based on topology
Similar resources
Feature Selection Technique for Text Document Classification: An Alternative Approach
Text classification and feature selection play an important role in correctly assigning documents to a particular category, given the explosive growth of textual information from electronic digital documents as well as the World Wide Web. A present challenge in text mining is to select important or relevant features from the vast number of features in the data set. The aim o...
A Comprehensive Filter Feature Selection for Improving Document Classification
The high dimensionality of bag-of-words vectors poses serious challenges for document classification: sparse data, overfitting, and irrelevant features. Filter feature selection is an effective method for dimensionality reduction that removes irrelevant features from the feature set. This paper focuses on two main problems of filter feature selection, which are the feature score computation and the imbal...
Feature Selection for the Classification of Large Document Collections
Feature selection methods are often applied in the context of document classification. They are particularly important for processing large data sets that may contain millions of documents and are typically represented by a large number of features, possibly tens of thousands. Processing large data sets thus raises the issue of computational resources and we often have to find the right trade-o...
A Real-Time Electroencephalography Classification in Emotion Assessment Based on Synthetic Statistical-Frequency Feature Extraction and Feature Selection
Purpose: To assess three main emotions (happy, sad and calm) by various classifiers, using appropriate feature extraction and feature selection. Materials and Methods: In this study, a combination of Power Spectral Density and a series of statistical features is proposed as statistical-frequency features. Next, a feature selection method from pattern recognition (PR) Tools is presented to e...
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
The K-nearest-neighbor (KNN) algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affect the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
Journal
Journal title: Egyptian Informatics Journal
Year: 2018
ISSN: 1110-8665
DOI: 10.1016/j.eij.2018.01.001